104 results found.
Written
Language Modeling Tool,
Language Type:
Bilingual
Languages:
Hindi Panjabi
Availability:
Freely Available
License:
MIT
Size:
78.03 MByte
Production Status:
Newly created-finished
Use:
Machine Learning
-
Paper title:Supervised Grapheme-to-Phoneme Conversion of Orthographic Schwas in Hindi and Punjabi
-
Paper track:Short/Phonology, Morphology and Word Segmentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Aryaman Arora | Schwa Deletion Model for Hindi and Punjabi | /N |
Documentation:
None
Written
Lexicon,
Language Type:
Monolingual
Languages:
Hindi
Availability:
From Owner
License:
Copyright © 1993 R. S. McGregor
Size:
34952 lexemes
Production Status:
Existing-used
Use:
Machine Learning
-
Paper title:Supervised Grapheme-to-Phoneme Conversion of Orthographic Schwas in Hindi and Punjabi
-
Paper track:Short/Phonology, Morphology and Word Segmentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Aryaman Arora | Hindi Pronunciation Dataset | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
Arabic Chinese English German Hindi Spanish Vietnamese
Availability:
Freely Available
License:
Size:
50+ GByte
Production Status:
Existing-used
Use:
Machine Learning
-
Paper title:MLQA: Evaluating Cross-lingual Extractive Question Answering
-
Paper track:Long/Question Answering
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Patrick Lewis | Wikipedia | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
Hindi
Availability:
From Owner
License:
Size:
4800 words
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
-
Paper title:The Four-way Classification of Stops with Voicing and Aspiration for Non-native Speech Evaluation
-
Paper track:2.3 Acoustic phonetics/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Preeti Rao | Hindi word utterances for word-initial plosives | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
Hindi
Availability:
Freely Available
License:
CC-BY-SA-NC
Size:
<Not Specified>
Production Status:
Newly created-in progress
Use:
Language Modelling
-
Paper title:HindEnCorp - Hindi-English and Hindi-only Corpus for Machine Translation
-
Paper track:Written
-
Paper status:Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Ondřej Bojar | Charles University in Prague, Faculty of Mathematics and Physics | CZ |
| Author 2 | Vojtěch Diatka | Charles University in Prague, Faculty of Arts, Department of Linguistics | CZ |
| Author 3 | Pavel Rychlý | NLP Centre, Faculty of Informatics, Masaryk University, Brno, Czech Republic | CZ |
| Author 4 | Pavel Straňák | Charles University in Prague | CZ |
| Author 5 | Vit Suchomel | Natural Language Processing Centre, Masaryk University | CZ |
| Author 6 | Aleš Tamchyna | Charles University in Prague, UFAL MFF | CZ |
| Author 7 | Daniel Zeman | Charles University in Prague, Faculty of Mathematics and Physics | CZ |
| Main Contact | Vojtěch Diatka | Charles University in Prague, Faculty of Arts, Department of Linguistics | None |
Documentation:
http://ufal.mff.cuni.cz/hindencorp/
Language Type:
Multilingual
Languages:
Arabic Chinese English Finnish Hindi
Availability:
Freely Available
License:
BSD 3
Size:
<Not Specified> <Not Specified>
Production Status:
Existing-updated
Use:
Corpus Creation/Annotation
-
Paper title:Gold Standard Annotations for Preposition and Verb Sense with Semantic Role Labels in Adult-Child Interactions
-
Paper track:Resource paper
-
Paper status:Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Lori Moon | University of Illinois at Urbana-Champaign | None |
| Author 2 | Christos Christodoulopoulos | Amazon | GB |
| Author 3 | Cynthia Fisher | University of Illinois | US |
| Author 4 | Sandra Franco | University of Illinois at Urbana-Champaign | N/A |
| Author 5 | Dan Roth | University of Illinois | US |
| Main Contact | Lori Moon | University of Illinois at Urbana-Champaign | None |
Documentation:
https://www.colorado.edu/ics/sites/default/files/attached-files/techreport02-09-jubilee.pdf
Written
Treebank,
Language Type:
Monolingual
Languages:
Bengali Chinese English Filipino Hindi Indonesian Japanese Khmer Lao Malay Myanmar Thai Vietnamese
Availability:
Freely Available
License:
CreativeCommons
Size:
20106 sentences
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
-
Paper title:Improving Low-Resource NMT through Relevance Based Linguistic Features Incorporation
-
Paper track:Long paper/
-
Paper status:Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Abhisek Chakrabarty | Asian Language Treebank Parallel Corpus | /N |
Documentation:
http://www2.nict.go.jp/astrec-att/member/mutiyama/ALT/ALT-Parallel-Corpus-20191206/README.txt
Written
Corpus,
Language Type:
Multilingual
Languages:
Afrikaans Albanian Amharic Arabic Aragonese Armenian Assamese Azerbaijani Basque Belarusian Bengali Bosnian Breton Bulgarian Burmese Catalan Central Khmer Chinese Croatian Czech Danish Dutch Dzongkha English Esperanto Estonian Finnish French Gaelic Galician Georgian German Greek Gujarati Hausa Hebrew Hindi Hungarian Icelandic Igbo Indonesian Irish Italian Japanese Kannada Kazakh Kinyarwanda Korean Kurdish Kyrgyz Latvian Limburgan Lithuanian Macedonian Malagasy Malay Malayalam Maltese Marathi Mongolian Nepali Northern Sami Norwegian Norwegian Bokmål Norwegian Nynorsk Occitan Oriya Panjabi Pashto Persian Polish Portuguese Romanian Russian Serbian Serbo-Croatian Sinhala Slovak Slovenian Spanish Swedish Tajik Tamil Tatar Telugu Thai Turkish Turkmen Uighur Ukrainian Urdu Uzbek Vietnamese Walloon Welsh Western Frisian Xhosa Yiddish Yoruba Zulu
Availability:
Freely Available
License:
Size:
55 million sentences
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
-
Paper title:Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation
-
Paper track:Long/Machine Translation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Biao Zhang | the open parallel corpus (OPUS) | /N |
Documentation:
None
Not Applicable
Contextualised word embeddings,
Language Type:
Monolingual
Languages:
Ancient Arabic Basque Bokmål Bulgarian Catalan Chinese Church Croatian Czech Danish Dutch English Estonian Finnish French Galician German Greek Hebrew Hindi Hungarian Indonesian Irish Italian Japanese Korean Latin Latvian Norwegian Nynorsk Old Persian Polish Portuguese Romanian Russian Simplified Chinese Slavonic Slovak Slovene Spanish Swedish Turkish Ukrainian Urdu Uyghur Vietnamese
Availability:
Freely Available
License:
none
Size:
18.4 GByte
Production Status:
Existing-used
Use:
Parsing and Tagging
-
Paper title:Treebank Embedding Vectors for Out-of-domain Dependency Parsing
-
Paper track:Short/Syntax: Tagging, Chunking and Parsing
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Joachim Wagner | Elmo For Many Languages | /N |
Documentation:
https://www.aclweb.org/anthology/K18-2005/
Speech
Corpus,
Language Type:
Monolingual
Languages:
Arabic Bengali Central Khmer Chinese Dari Egyptian Arabic English Georgian Hindi Iranian Persian Italian Japanese Korean Lao Mandarin Chinese Min Nan Chinese Moroccan Arabic Northern Khmer Panjabi Persian Russian Spanish Tagalog Thai Tigrinya Urdu Uzbek Vietnamese Wu Chinese Yue Chinese
Availability:
From Data Center(s)
License:
LDC
Size:
None
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
-
Paper title:End-to-End Neural Speaker Diarization with Permutation-Free Objectives
-
Paper track:4.5 Speaker diarization/Poster Presentation
-
Paper status:Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yusuke Fujita | 2008 NIST Speaker Recognition Evaluation | /N |
Documentation:
None




